Clustered SVD strategies in latent semantic indexing
Abstract
Text retrieval using the Latent Semantic Indexing (LSI) technique with truncated Singular Value Decomposition (SVD) has been intensively studied in recent years. The SVD reduces the noise contained in the original representation of the term-document matrix and improves information retrieval accuracy. Recent studies indicate that SVD is mostly useful for small homogeneous data collections. For large inhomogeneous datasets, the performance of the SVD-based text retrieval technique may deteriorate. We propose to partition a large inhomogeneous dataset into several smaller ones with clustered structure, and to apply the truncated SVD to each of them. Our experimental results show that the clustered SVD strategies may enhance the retrieval accuracy and reduce the computing and storage costs.
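The partition-then-truncate idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name a clustering method, so a plain k-means (`kmeans_labels`) stands in for the partitioning step, and the function names, the rank `k`, and the toy matrix are all hypothetical. The term-document matrix `A` is assumed to have one row per term and one column per document.

```python
import numpy as np


def kmeans_labels(X, n_clusters, n_iter=20, seed=0):
    """Lloyd's k-means on the rows of X. An assumed stand-in for the
    partitioning step; the abstract does not name a clustering method."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(X), size=n_clusters, replace=False)
    centers = X[start].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels


def clustered_lsi(A, labels, k):
    """Compute a rank-k truncated SVD separately for each document cluster."""
    labels = np.asarray(labels)
    models = []
    for c in np.unique(labels):
        doc_ids = np.flatnonzero(labels == c)
        U, s, Vt = np.linalg.svd(A[:, doc_ids], full_matrices=False)
        r = min(k, len(s))
        models.append((doc_ids, U[:, :r], s[:r], Vt[:r, :]))
    return models


def query_scores(models, q, n_docs):
    """Cosine between q and each document's rank-k approximation,
    evaluated inside the cluster that the document belongs to."""
    scores = np.zeros(n_docs)
    q_norm = np.linalg.norm(q)
    for doc_ids, U, s, Vt in models:
        q_hat = U.T @ q              # query projected onto the latent space
        d_hat = s[:, None] * Vt      # latent coordinates, one column per doc
        denom = q_norm * np.linalg.norm(d_hat, axis=0)
        denom = np.where(denom > 0, denom, 1.0)
        scores[doc_ids] = (q_hat @ d_hat) / denom
    return scores


# Toy collection: documents 0-2 use terms 0-2, documents 3-5 use terms 3-5.
block = np.array([[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]])
A = np.zeros((6, 6))
A[:3, :3] = block
A[3:, 3:] = block

# Labels are given explicitly here; kmeans_labels(unit_docs, 2) is one
# assumed way to obtain them from length-normalised document vectors.
labels = np.array([0, 0, 0, 1, 1, 1])
models = clustered_lsi(A, labels, k=2)

q = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0])   # query on the first topic
scores = query_scores(models, q, A.shape[1])
print(np.argsort(-scores))
```

Each cluster stores only its own small factors, which is where the storage and computing savings mentioned in the abstract would come from in a sketch like this. How a query is routed to clusters (all of them, as here, or only the closest ones) is a design choice the abstract does not specify.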
Related resources
Clustered SVD strategies in latent semantic indexing
The text retrieval method using latent semantic indexing (LSI) technique with truncated singular value decomposition (SVD) has been intensively studied in recent years. The SVD reduces the noise contained in the original representation of the term–document matrix and improves the information retrieval accuracy. Recent studies indicate that SVD is mostly useful for small homogeneous data collect...
Sparsification Strategies in Latent Semantic Indexing
The text retrieval method using Latent Semantic Indexing (LSI) with the truncated Singular Value Decomposition (SVD) has been intensively studied in recent years. The term-document matrices after SVD are full matrices, although the rank is reduced substantially. To reduce memory consumption, we examine some strategies to sparsify the truncated SVD matrices. After applying the sparsification str...
Clustering and Latent Semantic Indexing Aspects of the Singular Value Decomposition
This paper discusses clustering and latent semantic indexing (LSI) aspects of the singular value decomposition (SVD). The purpose of this paper is twofold. The first is to give an explanation on how and why the singular vectors can be used in clustering. And the second is to show that the two seemingly unrelated SVD aspects actually originate from the same source: related vertices tend to be mo...
A Novel Study for Summary/attribute Based Bug Tracking Classification Using Latent Semantic Indexing and Svd in Data Mining
This paper presents a Latent Semantic Indexing (LSI) method for learning bug tracking concepts in document data. Each attribute in a vector provides the mark of participation of the document in data or term in the parallel concept. The objective to describe the concepts summary based, but to be capable to signify the documents and relations in a combined way for showing document-similarity, docum...
Concept Lattice Generation by Singular Value Decomposition
Latent semantic indexing (LSI) is an application of a numerical method called singular value decomposition (SVD), which discovers latent semantics in documents by creating concepts from existing terms. The application area is not limited to text retrieval; many other applications, such as image compression, are known. We propose usage of SVD as a possible data mining method and lattice size reduction tool...
Journal title:
- Inf. Process. Manage.
Volume 41, Issue
Pages -
Publication date 2005